Smart Paper Notes

Machine Learning based Framework for Robust Price-Sensitivity Estimation with Application to Airline Pricing

Ravi Kumar , Shahin Boluki , Karl Isler , Jonas Rauch , Darius Walczak

Categories: Machine Learning (Statistics) | Machine Learning

2022-05-04

We consider the problem of dynamic pricing of a product in the presence of feature-dependent price sensitivity. Developing practical algorithms that can estimate price elasticities robustly, especially when information about no purchases (losses) is not available, to drive such automated pricing systems is a challenge faced by many industries. Based on the Poisson semi-parametric approach, we construct a flexible yet interpretable demand model where the price-related part is parametric while the remaining (nuisance) part of the model is non-parametric and can be modeled via sophisticated machine learning (ML) techniques. The estimation of price-sensitivity parameters of this model via direct one-stage regression techniques may lead to biased estimates due to regularization. To address this concern, we propose a two-stage estimation methodology which makes the estimation of the price-sensitivity parameters robust to biases in the estimators of the nuisance parameters of the model. In the first stage we construct estimators of observed purchases and prices given the feature vector using sophisticated ML estimators such as deep neural networks. Utilizing the estimators from the first stage, in the second stage we leverage a Bayesian dynamic generalized linear model to estimate the price-sensitivity parameters. We test the performance of the proposed estimation schemes on simulated and real sales transaction data from the airline industry. Our numerical studies demonstrate that our proposed two-stage approach reduces the estimation error in price-sensitivity parameters from 25% to 4% in realistic simulation settings. The two-stage estimation techniques proposed in this work allow practitioners to leverage modern ML techniques to robustly estimate price sensitivities while still maintaining interpretability and allowing ease of validation of their various constituent parts.
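
The two-stage idea can be illustrated with a much simpler stand-in than the model in the abstract. The sketch below is not the authors' Poisson semi-parametric model or their Bayesian dynamic GLM: it assumes a linear-Gaussian demand, uses a polynomial least-squares fit as the first-stage "ML" estimator, and uses ordinary residual-on-residual regression as the second stage. All function names, the simulated data, and the polynomial degree are hypothetical choices made for illustration.

```python
import numpy as np


def fit_predict(x, target, degree=5):
    """Stage 1 stand-in: polynomial least-squares estimate of E[target | x]."""
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coef


def two_stage_price_coefficient(x, price, demand):
    """Stage 2 stand-in: regress demand residuals on price residuals."""
    price_resid = price - fit_predict(x, price)
    demand_resid = demand - fit_predict(x, demand)
    return float(price_resid @ demand_resid / (price_resid @ price_resid))


# Simulated data (assumption): the price depends on the feature x, and the
# demand depends on the price plus a nonlinear nuisance term sin(x).
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-2.0, 2.0, size=n)
price = x**2 + rng.normal(size=n)
theta_true = -2.0
demand = theta_true * price + np.sin(x) + rng.normal(size=n)

theta_hat = two_stage_price_coefficient(x, price, demand)
print(f"true sensitivity {theta_true}, two-stage estimate {theta_hat:.3f}")
```

Because the second stage only sees what the first-stage fits could not explain, moderate errors in the nuisance fits have a reduced effect on the price coefficient. A production version would typically also use sample splitting, which this sketch omits.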

Computational Charisma -- A Brick by Brick Blueprint for Building Charismatic Artificial Intelligence

Björn W. Schuller , Shahin Amiriparian , Anton Batliner , Alexander Gebhard , Maurice Gerzcuk , Vincent Karas , Alexander Kathan , Lennart Seizer , Johanna Löchner

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2022-12-31

Charisma is considered to be one's ability to attract and potentially also influence others. Clearly, there can be considerable interest, from an artificial intelligence (AI) perspective, in providing AI with such a skill. Beyond this, a plethora of use cases opens up for the computational measurement of human charisma, such as tutoring humans in the acquisition of charisma, mediating human-to-human conversation, or identifying charismatic individuals in big social data. A number of models exist that base charisma on various dimensions, often following the idea that charisma is given if someone could and would help others. Examples include influence (could help) and affability (would help) in scientific studies, or power (could help), presence, and warmth (both would help) as a popular concept. Modelling high levels in these dimensions for humanoid robots or virtual agents seems accomplishable. Automatic measurement also appears quite feasible given the recent advances in the related fields of Affective Computing and Social Signal Processing. Here, we therefore present a blueprint for building machines that can appear charismatic, but also analyse the charisma of others. To this end, we first provide the psychological perspective, including different models of charisma and its behavioural cues. We then switch to conversational charisma in spoken language as an exemplary modality that is essential for human-human and human-computer conversations. The computational perspective then deals with the recognition and generation of charismatic behaviour by AI. This includes an overview of the state of play in the field and the aforementioned blueprint. We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.

Automatic Emotion Modelling in Written Stories

Lukas Christ , Shahin Amiriparian , Manuel Milling , Ilhan Aslan , Björn W. Schuller

Categories: Natural Language Processing

2022-12-21

Telling stories is an integral part of human communication which can evoke emotions and influence the affective states of the audience. Automatically modelling emotional trajectories in stories has thus attracted considerable scholarly interest. However, as most existing works have been limited to unsupervised dictionary-based approaches, there is no labelled benchmark for this task. We address this gap by introducing continuous valence and arousal annotations for an existing dataset of children's stories annotated with discrete emotion categories. We collect additional annotations for this data and map the originally categorical labels to the valence and arousal space. Leveraging recent advances in Natural Language Processing, we propose a set of novel Transformer-based methods for predicting valence and arousal signals over the course of written stories. We explore several strategies for fine-tuning a pretrained ELECTRA model and study the benefits of considering a sentence's context when inferring its emotionality. Moreover, we experiment with additional LSTM and Transformer layers. The best configuration achieves a Concordance Correlation Coefficient (CCC) of .7338 for valence and .6302 for arousal on the test set, demonstrating the suitability of our proposed approach. Our code and additional annotations are made available at https://github.com/lc0197/emotion_modelling_stories.
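
For readers unfamiliar with the metric reported above, the Concordance Correlation Coefficient can be computed as follows. This is a minimal sketch of the standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (biased) variances; the function name is hypothetical and the code is not taken from the authors' repository.

```python
import numpy as np


def concordance_cc(y_true, y_pred):
    """Concordance Correlation Coefficient between two 1-D signals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    # np.var defaults to the population variance (ddof=0), matching the
    # covariance computed below.
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


gold = [1.0, 2.0, 3.0, 4.0]
print(concordance_cc(gold, gold))                   # perfect agreement
print(concordance_cc(gold, [2.0, 3.0, 4.0, 5.0]))   # penalised for the offset
```

Unlike Pearson correlation, the CCC drops below 1 when predictions are shifted or rescaled relative to the gold signal, which is why it is a common choice for continuous valence and arousal prediction.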

METEOR Guided Divergence for Video Captioning

Daniel Lukas Rothenpieler , Shahin Amiriparian

Categories: Computer Vision | Natural Language Processing | Machine Learning

2022-12-20

Automatic video captioning aims for a holistic visual scene understanding. It requires a mechanism for capturing temporal context in video frames and the ability to comprehend the actions and associations of objects in a given timeframe. Such a system should additionally learn to abstract video sequences into sensible representations as well as to generate natural written language. While the majority of captioning models focus solely on the visual inputs, little attention has been paid to the audiovisual modality. To tackle this issue, we propose a novel two-fold approach. First, we implement a reward-guided KL Divergence to train a video captioning model which is resilient towards token permutations. Second, we utilise a Bi-Modal Hierarchical Reinforcement Learning (BMHRL) Transformer architecture to capture long-term temporal dependencies of the input data as a foundation for our hierarchical captioning module. Using our BMHRL, we show the suitability of the HRL agent in the generation of content-complete and grammatically sound sentences by achieving 4.91, 2.23, and 10.80 in BLEU3, BLEU4, and METEOR scores, respectively, on the ActivityNet Captions dataset. Finally, we make our BMHRL framework and trained models publicly available for users and developers at https://github.com/d-rothen/bmhrl.

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

Mostafa Shahin , Beena Ahmed , Julien Epps

Categories: Natural Language Processing

2022-10-19

One of the major challenges in acoustic modelling of child speech is the rapid change that occurs in children's articulators as they grow up, their differing growth rates, and the resulting high variability within the same age group. These high acoustic variations, along with the scarcity of child speech corpora, have impeded the development of a reliable speech recognition system for children. In this paper, a speaker- and age-invariant training approach based on adversarial multi-task learning is proposed. The system consists of one shared generator network that learns to generate speaker- and age-invariant features, connected to three discrimination networks for phoneme, age, and speaker. The generator network is trained to minimize the phoneme-discrimination loss and maximize the speaker- and age-discrimination losses in an adversarial multi-task learning fashion. The generator network is a Time Delay Neural Network (TDNN) architecture, while the three discriminators are feed-forward networks. The system was applied to the OGI speech corpora and achieved a 13% reduction in the WER of the ASR.
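
The training objective described above can be written compactly. The formulation below is a sketch in the usual adversarial multi-task style; the weights $\lambda_s$ and $\lambda_a$ and the symbol names are assumptions for illustration, since the abstract does not state how the three losses are balanced. Here $\theta_g$ denotes the generator parameters and $\theta_p$, $\theta_s$, $\theta_a$ the phoneme, speaker, and age discriminator parameters.

```latex
\begin{aligned}
(\hat{\theta}_g, \hat{\theta}_p) &= \arg\min_{\theta_g,\,\theta_p}\;
  \mathcal{L}_{\mathrm{phn}}(\theta_g, \theta_p)
  - \lambda_s\, \mathcal{L}_{\mathrm{spk}}(\theta_g, \hat{\theta}_s)
  - \lambda_a\, \mathcal{L}_{\mathrm{age}}(\theta_g, \hat{\theta}_a), \\
\hat{\theta}_s &= \arg\min_{\theta_s}\; \mathcal{L}_{\mathrm{spk}}(\hat{\theta}_g, \theta_s), \qquad
\hat{\theta}_a = \arg\min_{\theta_a}\; \mathcal{L}_{\mathrm{age}}(\hat{\theta}_g, \theta_a).
\end{aligned}
```

The minus signs express that the generator is rewarded when the speaker and age discriminators fail, while each discriminator still minimizes its own loss.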

Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results

Lukas Christ , Shahin Amiriparian , Alexander Kathan , Niklas Müller , Andreas König , Björn W. Schuller

Categories: Machine Learning | Natural Language Processing | Computer Vision

2022-09-28

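Humour is a substantial element of human affect and cognition. Its automatic understanding can facilitate more naturalistic human-device interaction and the humanisation of artificial intelligence. Current methods of humour detection are based solely on staged data, making them inadequate for "real-world" applications. We address this deficiency by introducing the novel Passau-Spontaneous Football Coach Humour (Passau-SFCH) dataset, comprising about 11 hours of recordings. The Passau-SFCH dataset is annotated for the presence of humour and its dimensions (sentiment and direction) as proposed in Martin's Humor Style Questionnaire. We conduct a series of experiments employing pretrained Transformers, convolutional neural networks, and expert-designed features. The performance of each modality (text, audio, video) for spontaneous humour recognition is analysed and their complementarity is investigated. Our findings suggest that for the automatic analysis of humour and its sentiment, facial expressions are most promising, while humour direction can best be modelled via text-based features. The results reveal considerable differences among subjects, highlighting the individuality of humour usage and style. Further, we observe that decision-level fusion yields the best recognition results. Finally, we make our code publicly available at https://www.github.com/eihw/passau-sfch. The Passau-SFCH dataset is available upon request.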
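
Decision-level (late) fusion combines the outputs of separately trained per-modality models rather than their features. The abstract does not specify the fusion rule, so the sketch below assumes a simple weighted average of per-modality humour probabilities followed by a 0.5 threshold; the function name, the weighting scheme, and the example numbers are hypothetical.

```python
import numpy as np


def late_fusion(probabilities, weights=None, threshold=0.5):
    """Fuse per-modality P(humour) arrays by (weighted) averaging.

    probabilities: dict mapping a modality name to an array of shape (n_samples,).
    weights: optional dict with one non-negative weight per modality.
    Returns the fused probabilities and the resulting binary labels.
    """
    names = sorted(probabilities)
    stacked = np.stack([np.asarray(probabilities[m], dtype=float) for m in names])
    if weights is None:
        w = np.full(len(names), 1.0 / len(names))
    else:
        w = np.array([weights[m] for m in names], dtype=float)
        w = w / w.sum()  # normalise so the fused value stays a probability
    fused = w @ stacked
    return fused, (fused >= threshold).astype(int)


# Made-up predictions for two video segments from three unimodal models.
predictions = {
    "text": [0.2, 0.9],
    "audio": [0.4, 0.6],
    "video": [0.6, 0.3],
}
fused, labels = late_fusion(predictions)
print(fused, labels)
```

Passing `weights` lets a stronger modality count for more, for example weights derived from each unimodal model's validation performance.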

On the Stability Analysis of Open Federated Learning Systems

Youbang Sun , Heshan Fernando , Tianyi Chen , Shahin Shahrampour

Categories: Machine Learning

2022-09-25

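We consider open federated learning (FL) systems, where clients may join and/or leave the system during the FL process. Given the variability in the number of clients present, convergence to a fixed model cannot be guaranteed in open systems. Instead, we resort to a new performance metric that we term the stability of open FL systems, which quantifies the magnitude of the learned model in open systems. Under the assumption that the local clients' functions are strongly convex and smooth, we theoretically quantify the radius of stability for two FL algorithms, namely local SGD and local Adam. We observe that this radius depends on several key parameters, including the function condition number as well as the variance of the stochastic gradient. Our theoretical results are further verified by numerical simulations on both synthetic and real-world benchmark datasets.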

Local Relighting of Real Scenes

Audrey Cui , Ali Jahanian , Agata Lapedriza , Antonio Torralba , Shahin Mahdizadehaghdam , Rohit Kumar , David Bau

Categories: Computer Vision

2022-07-06

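We introduce the task of local relighting, which changes a photograph of a scene by switching on and off the light sources that are visible within the image. This new task differs from the traditional image relighting problem, as it introduces the challenge of detecting light sources and inferring the pattern of light that emanates from them. We propose an approach for local relighting that trains a model without supervision from any novel image dataset, by using synthetically generated image pairs from another model. Concretely, we collect paired training images from a stylespace-manipulated GAN; we then use these images to train a conditional image-to-image model. To benchmark local relighting, we introduce Lonoff, a collection of 306 precisely aligned images taken in indoor spaces with different combinations of lights switched on. We show that our method significantly outperforms baseline methods based on GAN inversion. Finally, we demonstrate extensions of our method that control different light sources separately. We invite the community to tackle this new task of local relighting.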

EEPT: Early Discovery of Emerging Entities in Twitter with Semantic Similarity

Shahin Yousefi , Mohsen Hooshmand , Mohsen Afsharchi

Categories: Natural Language Processing | Machine Learning

2022-07-06

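Some events that will happen in the future could be important for companies, governments, and even our personal lives. Predicting these events before they become established helps with effective decision-making. We call such events emerging entities. They have not taken place yet, and there is no information about them in the knowledge base (KB). However, some clues exist in different areas, especially on social media. Thus, retrieving these types of entities is possible. This paper proposes a method for the early discovery of emerging entities. We use semantic clustering of short messages. To evaluate the performance of our proposal, we devise and utilize a performance evaluation metric. The results show that our proposed method finds emerging entities that Twitter trends are not always able to detect.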

Distributed Online System Identification for LTI Systems Using Reverse Experience Replay

Ting-Jui Chang , Shahin Shahrampour

Categories: Machine Learning | Machine Learning (Statistics)

2022-07-03

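Identification of linear time-invariant (LTI) systems plays an important role in control and reinforcement learning. Both asymptotic and finite-time offline system identification are well studied in the literature. For online system identification, the idea of stochastic gradient descent with reverse experience replay (SGD-RER) was recently proposed, where the data sequence is stored in several buffers and the stochastic gradient descent (SGD) update is performed backward in each buffer to break the time dependency between data points. Inspired by this work, we study distributed online system identification of LTI systems over a multi-agent network. We consider agents as identical LTI systems, and the network goal is to jointly estimate the system parameters by leveraging the communication between agents. We propose DSGD-RER, a distributed variant of the SGD-RER algorithm, and theoretically characterize the improvement of the estimation error with respect to the network size. Our numerical experiments confirm the reduction of the estimation error as the network size grows.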
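
The buffer-and-reverse idea behind SGD-RER can be sketched for a single agent. The code below is a simplified illustration, not the algorithm as analysed in the paper: it assumes a noise-driven system x_{t+1} = A x_t + w_t with no control input, omits the gaps between buffers and any iterate averaging, and does not include the multi-agent communication step of DSGD-RER. The function names, step size, buffer size, and system matrix are hypothetical.

```python
import numpy as np


def simulate_lti(A, n_steps, noise_std, rng):
    """Generate a trajectory of x_{t+1} = A x_t + w_t starting from zero."""
    d = A.shape[0]
    xs = np.zeros((n_steps + 1, d))
    for t in range(n_steps):
        xs[t + 1] = A @ xs[t] + noise_std * rng.normal(size=d)
    return xs


def sgd_rer(xs, buffer_size, lr):
    """Estimate A with SGD steps taken in reverse time order inside each buffer."""
    d = xs.shape[1]
    A_hat = np.zeros((d, d))
    n_transitions = xs.shape[0] - 1
    for start in range(0, n_transitions - buffer_size + 1, buffer_size):
        # Walk the buffer backward: newest transition first, oldest last.
        for t in range(start + buffer_size - 1, start - 1, -1):
            residual = xs[t + 1] - A_hat @ xs[t]
            # Gradient step on 0.5 * ||x_{t+1} - A x_t||^2 with respect to A.
            A_hat += lr * np.outer(residual, xs[t])
    return A_hat


rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])  # stable example system (assumption)
xs = simulate_lti(A_true, n_steps=4000, noise_std=0.1, rng=rng)
A_hat = sgd_rer(xs, buffer_size=50, lr=0.2)
error = np.linalg.norm(A_hat - A_true)
print(A_hat, error)
```

In a distributed variant, each agent would run updates of this kind on its own trajectory and periodically mix its estimate with those of its neighbours; how that mixing is done is specified in the paper, not here.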
